Data Mining in Macroeconomic Data Sets

Author

  • Ping Chen
Abstract

National Economic Input-Output (EIO) data describe the monetary transactions among economic sectors. These transactions form a weighted bi-directional network in which each edge runs from a supply sector to a demand sector and the edge weight equals the value of the transaction between them. In this research, we study the properties of this network and identify patterns in the evolution of inter-sector dependence by investigating the historical EIO data for the years 1947-1982. We make two contributions. The first is the discovery that economic transactions (the distribution of the edge weights) are highly skewed but follow the double Pareto-lognormal distribution (dPlN). The second is the design of a new method, "Multiple Steps of Pattern REcognition in skewed DAta" (M-SPREAD), which identifies patterns and clusters despite the skewness of the data set. We applied our methods to the EIO data and found interesting and explainable patterns, such as correlations among sectors, different evolution patterns at different transaction scales, outlier sectors, and outlier time-stamps.

The US Economic Input-Output (EIO) accounts [11] show how industries provide input to, and use output from, other industries to produce Gross Domestic Product (GDP). These accounts provide detailed information, in US dollars, on the flows of goods and services between industries, such as the purchase of coal from the coal mining sector by the power generation sector. Graphically, these sectors form a weighted bi-directional network through the economic transactions between them. Individual sectors are the vertices of the network, and each edge represents an economic transaction from a supply sector to a demand sector. The weight of an edge is the dollar amount of the monetary transactions between the two sectors.
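The network described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sector names and dollar amounts are hypothetical and are not taken from the EIO accounts, and the helper names `build_network`, `total_output`, and `total_input` are introduced here for the example.

```python
# Hypothetical (supplier, consumer, dollar value) records; not real EIO figures.
transactions = [
    ("coal_mining", "power_generation", 120.0),
    ("power_generation", "coal_mining", 15.0),
    ("power_generation", "steel", 80.0),
    ("steel", "coal_mining", 30.0),
]


def build_network(rows):
    """Return {supplier: {consumer: value}}, a weighted directed graph."""
    graph = {}
    for supplier, consumer, value in rows:
        # Register both endpoints so every sector appears as a vertex.
        graph.setdefault(supplier, {})
        graph.setdefault(consumer, {})
        # Sum repeated records for the same supplier-consumer pair.
        graph[supplier][consumer] = graph[supplier].get(consumer, 0.0) + value
    return graph


def total_output(graph, sector):
    """Dollar value the sector supplies to all other sectors."""
    return sum(graph[sector].values())


def total_input(graph, sector):
    """Dollar value the sector purchases from all other sectors."""
    return sum(edges.get(sector, 0.0) for edges in graph.values())


network = build_network(transactions)
print(total_output(network, "power_generation"))
print(total_input(network, "coal_mining"))
```

Storing the graph as a dictionary of dictionaries keeps the two directions of a sector pair as separate edges, which matches the supply-to-demand orientation described in the text.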
Figure 1 shows an example of part of the economy network, composed of three economic sectors and the amount of transactions between each pair of them. Studying the properties of the economy network, including its structure, the distribution of transaction sizes, and the evolution of the network, helps in understanding how the interconnections among these sectors form and change, and is therefore useful for predicting future changes in the economic system. Monetary connections and commodity supply-demand transactions determine the interdependence among economic sectors. Because of these supply-demand connections, the dysfunction of one economic sector can jeopardize the normal operation of another. The disruption of any …
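To give a feel for the skewed transaction-size distribution mentioned above, the following sketch draws synthetic values from a double Pareto-lognormal distribution. It assumes the commonly cited construction in which the logarithm of a dPlN variable is a normal term plus the difference of two scaled exponential terms. The function name `sample_dpln` and all parameter values are arbitrary choices for illustration; they are not fitted to the EIO data and are not the paper's estimates.

```python
import math
import random


def sample_dpln(alpha, beta, nu, tau, rng):
    """Draw one value whose log is nu + tau*Z + E1/alpha - E2/beta.

    Z is standard normal; E1 and E2 are independent standard exponentials.
    alpha shapes the upper tail and beta shapes the lower tail.
    """
    z = rng.gauss(0.0, 1.0)
    e1 = rng.expovariate(1.0)
    e2 = rng.expovariate(1.0)
    return math.exp(nu + tau * z + e1 / alpha - e2 / beta)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Arbitrary illustrative parameters, not estimates from the EIO accounts.
weights = sorted(
    sample_dpln(alpha=1.5, beta=1.0, nu=0.0, tau=1.0, rng=rng)
    for _ in range(10000)
)

mean = sum(weights) / len(weights)
median = weights[len(weights) // 2]
# Share of the total held by the largest 1% of synthetic transactions.
top_share = sum(weights[-100:]) / sum(weights)

print(f"mean={mean:.3f} median={median:.3f} top 1% share={top_share:.3f}")
```

A mean well above the median and a large share held by the top 1% are the kind of skewness the text refers to; an actual analysis would estimate the four parameters from the observed transaction values.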

Similar resources

On Mining Fuzzy Classification Rules for Imbalanced Data

A fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, so that it performs poorly on the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...

Application of Benford’s Law in Analyzing Geotechnical Data

Benford’s law predicts the frequency of the first digit of numbers met in a wide range of naturally occurring phenomena. In data sets that follow Benford’s law, numbers start with a small leading digit more often than with a large one. The law can be used as a tool for detecting fraud and abnormalities in number sets, including fabricated ones. This can be used as an ...

Comparing Medical Comorbidities Between Opioid and Cocaine Users: A Data Mining Approach

Background: Prescription drug monitoring programs (PDMPs) are instrumental in controlling opioid misuse, but opioid users have increasingly shifted to cocaine, creating a different set of medical problems. While opioid use results in multiple medical comorbidities, findings of the existing studies reported single comorbidities rather...

Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms

In recent years, due to the expansion of financial institutions, as well as the popularity of the World Wide Web and e-commerce, a significant increase in the volume of financial transactions has been observed. In addition to the increase in turnover, a sharp rise in fraud arising from abnormal user behavior is causing billions of dollars in losses worldwide. T...

Publication date: 2006